Network traffic control in peer-to-peer environments

ABSTRACT

A method and an electronic unit are disclosed for controlling traffic on a network, especially for controlling peer-to-peer related traffic. A filter unit intercepts messages related to a peer-to-peer application from a network line, irrespective of the messages' destination. A control logic then manages a request represented by an intercepted message subject to its content and subject to peering specific information.

TECHNICAL FIELD

The present invention relates to a network traffic control unit, a network comprising such a network traffic control unit, a method for controlling traffic on a network, and to a corresponding computer program product.

BACKGROUND OF THE INVENTION

Peer-to-peer applications are becoming more and more popular, since a wide range of data stored on computers at the edge of the Internet can now be accessed. Computers that formerly stored and provided data only for local access, and in addition provided means for retrieving data from Internet servers, may today serve as a database for other computers and may simultaneously receive data not only from Internet servers but also from other remote computers when executing peer-to-peer applications. This widens the pool of accessible data tremendously.

Below, the term peer, node, or peer node is used for an electronic device that can run a peer-to-peer application, for example, but not limited to, a computer, a workstation, or a PDA (personal digital assistant). Such a node should therefore be able to access a network in order to exchange information with other nodes.

Gnutella is currently one of the most prominent representatives of unstructured peer-to-peer applications, see “The Gnutella Protocol Specification v0.4 Document Revision 1.2”, retrieved from the Internet http://www9.limewire.com/developer/gutella_protocol_(—)0.4.pdf and accessed Nov. 15, 2002. These applications are called unstructured because nodes peer with other nodes in a random fashion. Searching in an unstructured peer-to-peer network is essentially random probing, as resources such as files or other services are made available on arbitrary nodes in the network, see “Search and Replication in Unstructured Peer-to-Peer Networks”, Qin Lv et al., in 16th ACM International Conference on Supercomputing, June 2002. The main advantages of these systems are their simplicity (for example, the protocols used are very simple) and their dynamics, in the sense that nodes can appear and disappear at a high rate. Another advantage is that search queries can be almost arbitrarily complex and can include keyword searching, substring matching, etc.

Peer-to-peer applications that include file exchange protocols, such as Gnutella (Gnu V0.4), dynamically establish an “overlay” network to exchange information. When a peer is started, it tries to peer with other peers using a request/accept protocol. The requesting peer sends a “connect request” to another peer node. If this other peer authorizes the connection, it answers with a “connect accept” and the two parties establish an adjacency. They can then start exchanging information that gets passed on to other peers.

FIG. 1 illustrates the way connectivity is achieved within an exemplary network comprising at least two peer groups 1 and 2, also called clusters. Peer group 1 comprises peer nodes A, B, and C; peer group 2 comprises peer nodes D and E. Reference 3 indicates some physical interconnection (wire-bound or wireless) between peer groups 1 and 2. Arrows indicate an already established peering connection, which is a logical rather than a physical interconnection. Such a connection is established by one of the peers sending a connect request message to the other peer, and the other peer accepting this connect request message with an accept message according to the protocol of the peer-to-peer application.

In FIG. 1 a), node C is peered with nodes A and B, and node D is peered with node E. Nodes C and D are prone to peering as C gets to know about D. Therefore C sends a connect request to D, and D accepts by sending an accept message back to C.

According to FIG. 1 b), C and D are now peered: C relays to D the messages issued by A and B, whereas D relays to C the messages issued by E. Next, A and D would like to peer as A gets to know about D. Therefore A sends a connect request to D, and D accepts by sending an accept message back to A.

According to FIG. 1 c), A and D are now peered in addition to the already existing peering connections. However, there are now two logical peering connections existing on the physical interconnecting link 3. These two logical connections were established by means of at least four messages crossing the interconnection 3.

This overlay network, that is, the network of logical connections, is an ad-hoc network that does not rely on an infrastructure. One well-known problem is how to bootstrap the peering mechanism, that is, how a peer can find addresses of other peers to peer with. Usually two types of techniques are used to solve this problem. One solution consists of the peer connecting to a server located at a well-known address. This server maintains a list of peers' addresses that are communicated to the peer. Another solution is for the peer itself to maintain a list of other peers it has peered with and to use addresses from this list.

With either technique, peering is done based on a list of addresses without taking into account the actual network infrastructure or the affinity between peers. The resulting overlay network is therefore typically completely de-correlated from the physical network. This can lead to a very inefficient use of the network resources and to poor performance of the file search protocol using this network.

FIG. 2 shows another example of how a physical network is flooded with peer-to-peer application messages in order to establish adjacencies between peer nodes. Three clusters 1, 2, 4 are shown. Big circles represent physical network nodes (e.g. routers, gateways), whereas small circles represent peer nodes. Dotted lines represent physical interconnections between physical network nodes, whereas straight lines represent logical interconnections between peer nodes. As can be derived from FIG. 2, peer node A communicates with peer node F only via peer nodes B to E, although A and F are adjacent in the physical network. Note that in this example the links joining cluster 1 to cluster 2 and cluster 1 to cluster 3 will easily get congested.

The structure of such peer-to-peer applications thus results in limited scalability due to brute-force flooding, and in a clear misfit between the overlay network topology and the underlying Internet topology representing the physical connections.

Flooding the underlying physical network with messages is not only a problem when looking for other peers to peer with but also when querying for information, such as data files, once an overlay network is established.

FIG. 3 illustrates an exemplary query process according to the protocol of a peer-to-peer application: FIG. 3 a) is similar to FIG. 1 a) and shows the establishment of a peering connection between nodes C and D.

According to FIG. 3 b), node A now issues a query request “Looking for vivaldi.mp3”. C forwards this query request to B and D, and D forwards this request to E. Arrows between two peer nodes pointing in only one direction indicate the transmitted query requests.

E is supposed to have what A is looking for, so E sends a confirmation message to D, see FIG. 3 c). D knows that the confirmation is related to a request coming from C and therefore sends the confirmation message to C. C knows that the confirmation message is related to a request coming from A and therefore sends the confirmation message to A. Then, A contacts E using other means, e.g. HTTP, to get the file.
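The query process of FIG. 3 can be sketched in a few lines of Python. This is only an illustrative model, not the Gnutella wire protocol: the class `Peer`, its method names, and the numeric query identifier are assumptions made for the example, and time-to-live handling is omitted. Each node remembers which neighbor delivered a query, so that the confirmation can travel back along the same path.

```python
class Peer:
    """A peer node in a simplified flooding overlay (hypothetical model, no TTL)."""

    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.came_from = {}   # query id -> neighbor that delivered the query
        self.hits = []        # (filename, holder) pairs received as originator

    def peer_with(self, other):
        # Models an accepted connect request: the adjacency is symmetric.
        self.neighbors.append(other)
        other.neighbors.append(self)

    def issue_query(self, query_id, filename):
        self.came_from[query_id] = None   # None marks this node as originator
        for n in self.neighbors:
            n.on_query(query_id, filename, self)

    def on_query(self, query_id, filename, sender):
        if query_id in self.came_from:
            return                        # already seen, do not flood again
        self.came_from[query_id] = sender
        if filename in self.files:
            self.on_hit(query_id, filename, self.name)
        for n in self.neighbors:
            if n is not sender:
                n.on_query(query_id, filename, self)

    def on_hit(self, query_id, filename, holder):
        previous = self.came_from[query_id]
        if previous is None:
            self.hits.append((filename, holder))   # reached the originator
        else:
            previous.on_hit(query_id, filename, holder)


# Topology of FIG. 3 b): C is peered with A, B and D; D is peered with E.
a, b, c, d = Peer("A"), Peer("B"), Peer("C"), Peer("D")
e = Peer("E", files=["vivaldi.mp3"])
c.peer_with(a)
c.peer_with(b)
c.peer_with(d)
d.peer_with(e)
a.issue_query(1, "vivaldi.mp3")
```

After the query, `a.hits` holds the confirmation that originated at E and was relayed via D and C.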

EP 1 229 442 A2 discloses a peer-to-peer protocol that is meant to be uniform for many different peer-to-peer applications. Different layers are defined, such as a platform core layer, a platform services layer, and a platform application layer. So-called rendezvous peers can maintain dynamic indexes for entities in the peer-to-peer platform, including peers or peer groups. Rendezvous peers are considered to be peers executing additional functions.

“Identifying and Controlling P2P/File-Sharing Applications”, retrieved from the Internet http://www.allot.com/html/solutions_notes_kazaa.shtm and accessed Oct. 17, 2002, “Packeteer: Another take on limiting P2P traffic”, by Ann Harrison, retrieved from the Internet http://www.nwfusion.com/newsletters/fileshare/2002/01297785.html and accessed Oct. 17, 2002, and “Four Steps to Application Performance across the Network”, by Packeteer™ Inc., retrieved from the Internet http://www.packeteer.de and accessed Oct. 17, 2002, each disclose a device that detects and identifies different types of traffic. In a second step, network and application behaviour, especially bandwidth consumption, is analyzed. According to the analysis, bandwidth is allocated to different applications.

EP 1 075 112 A1 describes a PNNI hierarchical network, whereby one of the peers represents a peer group as a peer group leader. The peer group leader has a memory for storing peer group topology data.

Several approaches to limit peer-to-peer traffic were introduced that are highly structured: “A scalable Content-Addressable Network”, by S. Ratnasamy et al., in ACM SIGCOMM, pages 161-172, August 2001; “Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems”, by A. Rowstron and P. Druschel, in IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), pages 329-350, November 2001; “Chord: A scalable Peer-to-peer Lookup Service for Internet Applications”, by I. Stoica et al., in Proceedings of the 2001 ACM SIGCOMM Conference, pages 149-160, August 2001. These approaches tightly control how and on which nodes information is stored. Also, peering of nodes is not random, and the resulting overlay networks are often congruent with the underlying Internet topology. The disadvantage is that these approaches do not cope well with very high dynamics, i.e. a rapidly changing user population makes these systems unstable. Furthermore, these systems excel in exact-match queries but have some weaknesses in keyword-based queries and substring queries.

Therefore, it is desirable to provide means for controlling network traffic while leaving the peers that cause such traffic unchanged.

SUMMARY OF THE INVENTION

According to one aspect of the invention, there is provided a network traffic control unit comprising a filter unit for intercepting messages from a network line. Messages relating to a peer-to-peer application are intercepted irrespective of the destination of a message. There is further provided a control logic that is configured for managing a request represented by an intercepted message, subject to its content and subject to peering specific knowledge the network traffic control unit provides.

According to another aspect of the invention there is provided a method for controlling traffic on a network, comprising receiving messages relating to a peer-to-peer application, intercepted by a filter unit from a network line, irrespective of the messages' destination, and managing a request represented by an intercepted message, subject to its content and subject to peering specific information.

The filter unit filters messages that indicate in one way or another that they are peer-to-peer application related. Peer-to-peer applications typically enable user computers to act as both client and server for data files or services to other user computers. In a preferred embodiment, the filter unit checks designated port fields of TCP messages for the appearance of defined port numbers that indicate a peer-to-peer application. A peer-to-peer application might be identified by a port number that is different from the port numbers of other peer-to-peer applications and different from the port numbers of non peer-to-peer applications. However, other significant information in a message might be used to filter peer-to-peer application related messages. The network traffic control unit and its filter unit might be prepared to filter and then to control only messages related to a certain peer-to-peer application, or might be prepared to filter and then to control messages of different known peer-to-peer applications. Messages not relating to a peer-to-peer application are typically not affected and can pass the filter unit unhampered.
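A minimal sketch of such port-based filtering is given here in Python, purely for illustration; the preferred embodiment places this logic on a network processor. The port set `P2P_PORTS` (6346 is commonly used by Gnutella) and the helper names `tcp_ports`, `is_p2p`, `filter_unit`, and `make_segment` are assumptions of the example. The code reads the source and destination port fields, which occupy the first four bytes of a TCP header, and never looks at the destination address.

```python
import struct

# Assumed set of ports that identify peer-to-peer applications.
P2P_PORTS = {6346}


def tcp_ports(segment: bytes):
    """Return (source port, destination port) from the start of a TCP header."""
    return struct.unpack("!HH", segment[:4])


def is_p2p(segment: bytes) -> bool:
    src, dst = tcp_ports(segment)
    return src in P2P_PORTS or dst in P2P_PORTS


def filter_unit(segments):
    """Split traffic into intercepted peer-to-peer segments and untouched ones."""
    intercepted, passed = [], []
    for seg in segments:
        (intercepted if is_p2p(seg) else passed).append(seg)
    return intercepted, passed


def make_segment(src: int, dst: int) -> bytes:
    """Build a dummy 20-byte TCP header carrying only the two port fields."""
    return struct.pack("!HH", src, dst) + bytes(16)
```

Segments that do not match simply end up in the second list, mirroring the statement that other traffic passes the filter unit unaffected.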

The filter unit thus intercepts peer-to-peer application traffic on a network line irrespective of the destination of the messages. The traffic that is filtered is thus not directed to the IP address, or any other address, of the network traffic control unit but is typically addressed to peer destinations. Nevertheless, the network traffic control unit intercepts this kind of traffic in order to gain control over it.

In order to achieve extended control over peer-to-peer traffic on a network, it is preferred to give the network traffic control unit access to a network line that carries large amounts of such traffic. A preferred network line to be accessed by the filter unit is an ingress/egress line to a group or cluster of peers, such that all or most of the network traffic to or from peers of this cluster has to pass this network line and can be monitored.

On a lower level of a hierarchical communication layer, a message is represented by one or more data packets, as indicated above with regard to the TCP protocol. Other protocols may of course be used instead. The filter unit might be embodied as packet filtering logic implemented on a network processor. Since the network traffic control unit and the associated proposed method primarily have to manage requests from peers, it is particularly useful to detect such requests. This detection can be implemented by the filter unit: for example, a request might be expressed in the corresponding data packet by a defined code in a designated field of the data packet. The filter unit can then be prepared to check this field for a given number of codes representing a request. Other peer-to-peer application messages may also be filtered but treated differently from requests from a management point of view. Alternatively, messages comprising peer-to-peer application requests might be detected by intercepting peer-to-peer application messages by means of the filter unit and having a command field of such messages analyzed by the control logic.

The control logic may be implemented in hardware, in software, in a combination thereof, or in any other suitable way. A task assigned to the control logic is to manage the requests that are intercepted. Managing means that such requests are handled by the control logic in a way that may differ from the way envisaged by the peer-to-peer application to which the request pertains, but that still satisfies the requesting peer, thereby preferably causing less traffic on the network than the peer-to-peer application would cause. The network traffic control unit might therefore preferably set up new messages, redirect requests, or interact with the requesting peer, with the peers to which intercepted messages are addressed, or even with other network traffic control units. These are only some of the actions a network traffic control unit could provide; it does not necessarily have to provide all of them. Conversely, the opportunities for managing requests are not limited to the enumerated actions.

Basically, the control logic discovers the content of such an intercepted message and coordinates measures to satisfy the needs expressed by the message, dependent on the content of the message and dependent on knowledge the network traffic control unit has, either stored in a memory or obtained by accessing other sources of knowledge. This knowledge is peering specific knowledge that helps in taking measures to satisfy queries, connect requests, or other requests more efficiently. Typically, peers by themselves do not have this knowledge available.

Thus, the invention allows a dramatic reduction of the network traffic caused by peer-to-peer applications by installing a network traffic control unit that takes the lead in managing requests intercepted from a network line. Adding such smart control creates benefits in controlling and limiting traffic initiated by peer-to-peer applications. This can be achieved without changing or amending either the participating peers or the network structure, and even without making the introduction of such a network traffic control unit known to the peers or other entities within the network. The topology of the peer-to-peer overlay network is enhanced. Network traffic control units can be added or removed without requiring any changes to the peers.

The network traffic control unit can be a stand-alone electronic device in one preferred embodiment. In another preferred embodiment, the functions of the network traffic control unit are added to the functions of a router, such that only one device is responsible for both router and traffic control functions.

According to many of the preferred embodiments introduced below, the control logic is sending messages in order to manage requests. This has to be interpreted such that the control logic primarily decides on sending messages, while the physical transmission of messages is initiated by an interface that is controlled by the control logic.

In a preferred embodiment, the intercepted message is dropped. This step is performed after the content of the message has been evaluated. Dropping the intercepted message expresses that the control logic takes over further management and determines new ways to handle the request. This is a first traffic limiting effort.

Preferably, a request to be managed is a connect request issued from a peer node and directed to another peer node. Such a connect request is sent in order to establish a connection to another peer that, once it has accepted the connect request, may provide the contacting peer with the information or service the contacting peer is looking for. It is important to have connect requests handled by the control logic of the network traffic control unit, since such connect requests might cause many succeeding connect requests between other peers, for example when the peer-to-peer application has a connected peer send connect requests to other peers it is aware of. By managing such connect requests, and thus controlling the actions for satisfying these requests, the flood of peer-to-peer traffic can be contained dramatically.

A preferred way to manage a connect request is to handle further actions with regard to already existing connections the network traffic control unit is involved in. Whenever a peer is requesting connectivity to another peer, and the requesting peer is already connected to a third peer, preferably of the same remote cluster, the network traffic control unit might desist from sending a new request to this cluster, especially when it is aware that the other peer is already connected to the requesting peer via the third peer.

In a preferred embodiment, the network traffic control unit therefore provides, as peering specific knowledge, information on the peer-to-peer connections the network traffic control unit is currently aware of.

As indicated above, preferably no message is sent to the addressee of the intercepted connect request when a connection is already established that can serve the requesting peer node.
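The decision described in the preceding paragraphs can be sketched as follows. This is a hedged illustration only: the class `PeeringKnowledge`, the function `manage_connect_request`, and the returned action strings are hypothetical names, and the rule used here (drop the request whenever the addressee is already reachable through known connections) is one possible reading of "a connection is already established that can serve the requesting peer node".

```python
from collections import deque


class PeeringKnowledge:
    """Connections the network traffic control unit is currently aware of."""

    def __init__(self):
        self.links = {}   # peer -> set of peers it is directly connected to

    def add_connection(self, a, b):
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def reachable(self, src, dst):
        """Breadth-first search over the known overlay connections."""
        seen, queue = {src}, deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                return True
            for nxt in self.links.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return False


def manage_connect_request(knowledge, originator, addressee):
    """Return the action for an intercepted connect request (assumed labels)."""
    if knowledge.reachable(originator, addressee):
        return "drop"                   # an existing connection already serves it
    return "send_connect_request"       # a new connection is needed
```

With the connections of FIG. 1 b) recorded (A-C, B-C, C-D, D-E), a connect request from A to D yields `"drop"`, which is the second logical connection over link 3 that FIG. 1 c) shows being established without such control.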

In another preferred embodiment, the control logic initiates sending a connect request to the originator of the intercepted connect request in response to the intercepted connect request. This serves to gain full control over the handling of the intercepted connect request. The network traffic control unit sends this connect request with its own ID as originator. From then on, the requesting peer communicates exclusively with the network traffic control unit. Traffic can be controlled and limited effectively.

Where appropriate, the network traffic control unit sends a connect request with its own ID as originator to the addressee of the intercepted connect request. This might be reasonable in order to satisfy the needs of the requesting peer as long as there is no other connection established in particular to this peer or in general to this cluster. When there is a connection to another peer of this remote cluster, the network traffic control unit might prefer using the existing connection to reach the requested peer instead of fulfilling the original request to connect.

In another preferred embodiment, the network traffic control unit sends a connect request to the addressee of the intercepted connect request, thereby pretending that the originator of the intercepted connect request is sending the connect request. This is an alternative method of controlling the establishment of connections, in which the network traffic control unit does not appear under its own identity.

It may be preferred to send a connect request to a peer node other than the addressee of the intercepted connect request in response to the intercepted connect request. This other peer node might support establishing a connection to the requesting peer node. There might be different reasons and strategies for redirecting a connect request by the network traffic control unit. Typically, the network traffic control unit acts under its own identity when redirecting a connect request.

Especially when a connect request is directed to a peer of another remote cluster and another network traffic control unit is allocated to this cluster, it is preferred that the local network traffic control unit exclusively “talks” to peers of the other cluster via the remote network traffic control unit. This limits traffic drastically. Such a connect request to another network traffic control unit might also be advantageous in order to receive peering specific information the other network traffic control unit provides in preparation for connecting to peers of the remote cluster.

When the network traffic control unit intercepts a connect request and subsequently acts under its own identity, further actions are preferably initiated only after the originator of the intercepted connect request accepts the connect request that is sent to it by the network traffic control unit. This prevents traffic from being generated when the originator is not prepared to communicate with the network traffic control unit.

Especially for managing connect requests described above, the network traffic control logic is preferably prepared to communicate according to a protocol of the peer-to-peer application.

Other requests that are preferably handled by the network traffic control unit are data file queries issued by a peer node and brought to the attention of the network traffic control unit by way of filtering. These query requests also cause a lot of subsequent traffic, so that effective management of such requests is vital for reducing overall peer-to-peer induced traffic on the network. Typically, a query request is sent after peers are connected, in order to figure out which of these online peers can provide the information the querying peer is looking for.

In a preferred embodiment, managing such a query request is subject to an index that allocates keys, representing data files for download or representing services, to network traffic control units. This index is considered peering specific knowledge. A key specifies at least a part of the content of a certain query and is generated from the content of the respective query request according to fixed rules that the control logic preferably implements. Having derived such a key from the query request, the network traffic control unit derives from this index which network traffic control unit, among some or many network traffic control units, is responsible for administering information on this key. This information in turn maps peer nodes to keys. The mapped peer nodes are currently registered as providing a file the key stands for.

In a preferred embodiment of the invention, the peering specific knowledge a network traffic control unit provides comprises an index that allocates keys representing data files for download to network traffic control units. This index is preferably stored locally in every network traffic control unit and distributed regularly, or updated on a regular or event-driven basis. Every network traffic control unit is responsible for administering information related to a number of keys. As keys ultimately represent information on queries, and especially on queried data files, every network traffic control unit administers information on a number of different data files. Such information, collected in another index, then allocates peer nodes to keys, thus giving detailed information on which peer can actually provide a certain data file.

Since looking for a file or a service can be expressed in queries in many different ways by different strings, the search strings of query requests are not very suitable for executing the query request immediately. Therefore, it is preferred that one or more keys are derived from the content of the query request, which is in particular a string. The underlying set of rules is preferably stored by the network traffic control unit; its control logic is configured for implementing such rules for deriving keys from query requests.
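The text does not fix the rules for deriving keys, so the following Python sketch uses assumed rules purely for illustration: lower-case the string, split it into alphanumeric tokens, discard very short tokens and a small hypothetical ignore list, and hash each remaining word with SHA-1. The function names `derive_keys` and `responsible_unit`, the ignore list, and the modulo-based allocation of keys to network traffic control units are likewise assumptions, not the allocation index of the embodiment.

```python
import hashlib
import re

# Hypothetical ignore list: file extensions and filler words carry no meaning.
IGNORED = {"mp3", "avi", "the", "for"}


def derive_keys(query: str):
    """Derive a sorted list of keys from a query string (assumed rules)."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    words = sorted({t for t in tokens if len(t) >= 3 and t not in IGNORED})
    return [hashlib.sha1(w.encode("utf-8")).hexdigest() for w in words]


def responsible_unit(key: str, units):
    """Pick the unit administering a key; a stand-in for the key-to-unit index."""
    return units[int(key, 16) % len(units)]
```

Because the rules are fixed, differently phrased query strings such as `"Vivaldi.MP3"` and `"vivaldi mp3"` yield the same key, and every unit applying the same rules arrives at the same responsible unit for it.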

When such a key or keys are derived from a query string by means of the control logic, and when a network traffic control unit that administers the keys is found by screening the corresponding index, a request is directed to the one or more remote network traffic control units that are allocated to the derived keys, in order to obtain information on which peers have the files represented by the keys available. The requested network traffic control unit or units preferably send such information back to the requesting network traffic control unit. A hit message from the network traffic control unit to the querying peer node might then be preferred, so that the peer node can select any number of the data files offered. Many preferred variations of this process are introduced later on.

Some network traffic control units therefore preferably provide a key-peer node index for some keys. These network traffic control units provide other network traffic control units with the knowledge of which peer nodes are allocated to a requested key according to the key-peer node index. Administration tasks of such a network traffic control unit preferably include updating the index by adding and removing entries.

In another preferred embodiment, a way of updating indexes of peering specific knowledge is introduced: hit messages sent from a peer node associated with the network traffic control unit are monitored. One or more keys are derived from the content of a hit message. The sending peer node is allocated to the derived keys, and the key-peer node relation is stored in the key-peer node index at the network traffic control unit that administers the index the key is part of. This method helps to keep peering specific knowledge up to date.
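A rough sketch of this update step is shown here. It assumes a single local index and a deliberately simple stand-in key rule (`keys_of`, which keeps lower-case tokens of at least four characters); in the embodiment the relation would be stored at whichever network traffic control unit administers the key. The class name `KeyPeerIndex` and its methods are hypothetical.

```python
import re
from collections import defaultdict


def keys_of(text: str):
    """Stand-in key derivation: lower-case tokens of at least four characters."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) >= 4}


class KeyPeerIndex:
    """Maps keys to the peer nodes registered as providing matching files."""

    def __init__(self):
        self.peers_by_key = defaultdict(set)

    def on_hit(self, sender: str, hit_content: str):
        # A monitored hit message registers its sender under every derived key.
        for key in keys_of(hit_content):
            self.peers_by_key[key].add(sender)

    def remove_peer(self, peer: str):
        # Administration task: drop entries of a peer that has disappeared.
        for key in list(self.peers_by_key):
            self.peers_by_key[key].discard(peer)
            if not self.peers_by_key[key]:
                del self.peers_by_key[key]

    def lookup(self, key: str):
        return set(self.peers_by_key.get(key, ()))
```

Since entries are created only from observed hit messages, the index reflects what peers have actually offered, and `remove_peer` covers the removal side of the administration tasks mentioned above.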

Preferably, such an advanced search, including the underlying communication between network traffic control units as well as the administration of indexes, tables, or other peering specific knowledge, is accomplished using a protocol different from the peer-to-peer application protocol.

Such a protocol is more efficient and addresses the above mentioned purposes. This protocol is specifically used for managing query requests.

For many purposes, it is preferred to have peering specific knowledge available that comprises information on peer nodes associated with the network traffic control unit. This helps to optimize managing efforts, as peer nodes of a joint cluster are typically located close to each other. Such distance information might affect the management of requests by the network traffic control unit.

According to another aspect of the invention, there is provided a network comprising at least one group of peer nodes, a network line serving as ingress/egress line for this peer group, and a network traffic control unit according to any one of claims referring to such unit.

According to another aspect of the invention, there is provided a computer program element comprising computer program code which, when loaded in a processor unit of a network traffic control unit, configures the processor unit for performing a method as claimed in any one of the method claims.

Advantages of the different aspects of the invention and their embodiments go along with the advantages of the inventive network traffic control unit and method described above.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention and its embodiments will be more fully appreciated by reference to the following detailed description of presently preferred but nonetheless illustrative embodiments in accordance with the present invention when taken in conjunction with the accompanying drawings.

The figures illustrate:

FIG. 1, a known way of establishing peer-to-peer connections over a network,

FIG. 2, a symbolic diagram of a network showing connections established according to a known peer-to-peer application,

FIG. 3, a known way of querying information according to a known peer-to-peer application,

FIG. 4 a), a diagram showing a network, in accordance with an embodiment of the present invention,

FIG. 4 b), a block diagram of a network traffic control unit, in accordance with an embodiment of the present invention,

FIG. 4 c), a flow chart of a method for controlling traffic on a network, in accordance with an embodiment of the present invention,

FIG. 5, a diagram showing the way messages are exchanged, in accordance with an embodiment of the present invention,

FIG. 6, a way of establishing peer-to-peer connections over a network, in accordance with an embodiment of the present invention,

FIG. 7, another way of establishing peer-to-peer connections over a network, in accordance with an embodiment of the present invention,

FIG. 8, a symbolic diagram of a network showing connections established, in accordance with an embodiment of the present invention,

FIG. 9, a block diagram of a network, in accordance with an embodiment of the present invention,

FIG. 10, a data structure a network traffic control unit provides as peering specific knowledge, in accordance with an embodiment of the present invention,

FIG. 11, a flow chart showing a search for data files or services, in accordance with an embodiment of the present invention, and

FIG. 12, a flow chart showing a method for updating peering specific knowledge.

Different figures may contain identical references, representing elements with similar or uniform content.

DETAILED DESCRIPTION OF THE DRAWINGS

FIG. 4 shows embodiments of different aspects of the present invention. FIG. 4 a) illustrates a chart of a network in accordance with an embodiment of the present invention, FIG. 4 b) a block diagram of a network traffic control unit in accordance with an embodiment of the present invention, and FIG. 4 c) a flow chart of a method of controlling traffic in accordance with an embodiment of the present invention.

FIG. 4 a) shows two clusters 1 and 2. The clusters 1 and 2 are physically connected via an interconnection 3. Cluster 1 comprises some routers 10 and an edge router 11, which is the router that is directly connected to the interconnection line 3, also called network line 3. Neither the peers belonging to each cluster nor any logical connections between peers are shown.

A network traffic control unit 5, also called booster, is introduced in the network, co-located with the network ingress/egress node (that is, the edge router 11), in order to control at least a large part of the traffic that cluster 1 transmits and receives. This traffic is present on the network line 3.

FIG. 4 b) shows a block diagram of the network traffic control unit 5. The network traffic control unit 5 comprises a filter unit 51 implemented in a network processor. The filter unit 51 monitors the network line 3 and filters all messages on this network line 3 that relate to peer-to-peer applications. Other messages, or rather the corresponding data packets, are not affected and make their way to their destination on network line 3. The network traffic control unit 5 further comprises a control logic 52 that receives intercepted messages and is also capable of sending, or initiating the sending of, messages over the network line 3. A memory 53 is provided for storing peering specific knowledge the control logic has access to.

FIG. 4 c) shows a flow chart of the way the network traffic control unit 5 acts on messages that are sent over the network line 3, and especially on messages that represent a connect request sent from a peer of cluster 1. The overall principle here is that the network traffic control unit 5 controls the establishment of the overlay network topology and thereby enhances the performance of the protocol.

The basic principle is to intercept peer-to-peer connect requests issued by cluster 1 peers and to force the requesting peer to peer with the network traffic control unit 5. The interception is performed by the filter unit 51. Only the network traffic control unit 5 peers with external peers and, if necessary, relays protocol messages issued by the peers located inside the network cluster 1. Whenever the control logic 52 decides that no new connection is needed, the intercepted packet is dropped and no further action is required. This decision is based on the content of the intercepted request (for example, a request from peer X to connect to peer Y) and on peering specific knowledge (for example, the information that peer X is already connected to peer Y via another peer of the same cluster). This drastically limits the traffic on the ingress/egress link and allows the protocol to scale.

FIG. 5 shows the protocol exchanges leading to the interception of the connect request and how the booster peers with the requested peer. Peer A sends a connect request to peer B. Booster 5 intercepts the request since a peer-to-peer application message is detected. The originator's address, the addressee, and the content of the message (that is, a connect request) are extracted. Then the connect request is dropped. Booster 5 then issues a connect request under its own identity to peer A, which is going to accept it as it is looking for peer nodes.

The booster 5 then might peer with external peers or other boosters. This scheme can be extended using sophisticated information exchanges among boosters to enhance the protocol's performance. For example, summaries of files available in the booster's network can be generated using distributed hash tables (e.g. CAN/Chord or Pastry/Tapestry).

FIG. 6 shows a diagram explaining the way messages are exchanged in accordance with an embodiment of the present invention. It illustrates how connectivity is achieved within an exemplary network comprising at least two peer groups 1 and 2. Peer group 1 comprises peer nodes A, B and C, and peer group 2 comprises peer nodes D and E. Reference 3 indicates a network line between peer groups 1 and 2. Arrows pointing in two directions indicate an already established peering connection, which is a logical interconnection based on some physical interconnection. Such a connection is established by having one of the peers send a connect request message to the other peer and the other peer accept this connect request message with an accept message according to the protocol of the peer-to-peer application.

In FIG. 6 a), node C is already peered with nodes A and B, and node D is peered with node E. Nodes C and D become candidates for peering as C gets to know about D. Therefore C sends a connect request to D. Such messages are indicated by an arrow pointing from the originator to the addressee. In the charts, only the originator of a message is indicated verbally in brackets, but the real message also contains the identification of the addressee.

A network traffic control unit 5 according to an embodiment of the invention is introduced. The identifier of the network traffic control unit 5 is “G”. It filters messages of peer-to-peer applications. Thus, the connect request from C to D is intercepted by the network traffic control unit 5. Its information/content is extracted and the request is dropped. Now, network traffic control unit 5 takes full control of managing further actions in response to the intercepted connect request to fulfill the needs of C: it therefore sends a connect request to C. C accepts. According to peering specific knowledge the network traffic control unit 5 has access to, it is still necessary to contact D. Therefore, network traffic control unit 5 sends a connect request to D containing its identifier G. D accepts.

According to FIG. 6 b), C and G are now peered, as are G and D. Now A and D would like to peer as A gets to know about D. Therefore A sends a connect request to D that is intercepted by G, the network traffic control unit 5. The request is dropped after extracting the message's content. G sends a connect request to A. A accepts.

There is no need for establishing a further logical connection between A and D since C is already connected to D and A is connected to C. Thus, the network traffic control unit 5 takes no further action and in particular neither forwards the connect request from A nor contacts D another time. According to FIG. 6 c), there is no further connection between A and D, or G and D, as a result. Thus, traffic is limited.
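The message sequence of FIG. 6 can be replayed with a small Python model. This is a simplified sketch under stated assumptions: the class and attribute names are hypothetical, every contacted peer is assumed to accept, and the peering specific knowledge is reduced to two sets of peers that the unit G is already peered with.

```python
class Booster:
    """Toy model of network traffic control unit 5 ("G") managing connect requests."""

    def __init__(self, identifier="G"):
        self.identifier = identifier
        self.local_peers = set()     # own-cluster peers already peered with G
        self.external_peers = set()  # external peers already peered with G
        self.sent = []               # connect requests issued by G: (from, to)

    def on_connect_request(self, originator, addressee):
        """Handle an intercepted connect request; the original is always dropped."""
        if originator not in self.local_peers:
            # G peers with the requesting peer under its own identity.
            self.sent.append((self.identifier, originator))
            self.local_peers.add(originator)  # assumption: the peer accepts
        if addressee not in self.external_peers:
            # Peering specific knowledge says the addressee is not yet reachable.
            self.sent.append((self.identifier, addressee))
            self.external_peers.add(addressee)  # assumption: the peer accepts
        # Otherwise an existing connection already serves the request.
```

Replaying the figure, `on_connect_request("C", "D")` makes G contact C and then D, while a later `on_connect_request("A", "D")` only makes G contact A, because D is already reachable and no second request crosses the network line 3.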

FIG. 7 shows basically an alternative to FIG. 5 with regard to the way messages are exchanged. FIG. 7 illustrates the same network with the same elements as FIG. 6. The way the network traffic control unit 5 manages intercepted connect requests is now different: the network traffic control unit 5 does not appear under its own identity but acts in a more hidden way. Connect requests are still intercepted and dropped. After having evaluated peering specific knowledge and the content of the message, it might still be reasonable for network traffic control unit 5 to contact D. But now, D is approached with a connect request that looks like the original one sent by C, showing C as originator instead of G. The accept message from D is also intercepted and an identical accept message is forwarded to C by the network traffic control unit 5. This method is shown in FIG. 7 a). FIG. 7 b) illustrates the actual connections of the overlay network afterwards.

FIG. 8 illustrates a diagram of a network as an embodiment of the invention. Compared to the known network according to FIG. 2, network traffic control units 5 are now installed at ingress/egress nodes. They are exclusively responsible for establishing the overlay network, which is indicated by straight lines. As can be derived from the diagram, only one connection is established between network traffic control units 5 of different clusters 1, 2, 4, which reduces traffic tremendously. Also, peer A can now peer with peer F via peer B within cluster 1. This knowledge was provided and applied by the network traffic control units 5 while establishing connections and managing intercepted connect requests.

The scalability problems of unstructured peer-to-peer approaches can alternatively or additionally be alleviated by replacing the brute-force searching (querying) with an intelligent data location mechanism. Thus, the network traffic control unit provides managing capabilities for managing query requests that are intercepted and analyzed. Again, the peers can remain unchanged whereas in the core of the network an advanced location mechanism is used. FIG. 9 shows a network according to an embodiment of the invention. The peers that are located within three peer groups 1, 2 and 4 are all named Gnutella Peers, as in this embodiment the traffic related to the Gnutella peer-to-peer application is to be managed. The network traffic control units 5 are located such that each network traffic control unit 5 is associated with a peer group and has access to all sent or received messages the peers of its group are involved in when communicating with peers of other peer groups. With regard to physical network topology, network traffic control units typically sit between access and edge routers such that they may intercept peer-to-peer messages. In this way, a network traffic control unit serves a number of peers in its vicinity to which it is close in network terms.

Among the network traffic control units 5, and especially for managing intercepted query requests in an intelligent low-traffic way, a protocol different from the peer-to-peer protocol is used in order to better match these new requirements. Such a protocol may be named “Advanced Search and Location Protocol”. The advantages of this approach are: In the core of the network, flooding is replaced by a scalable advanced location mechanism. This improves scalability and significantly reduces the amount of control traffic. The peers do not have to be replaced or changed. In particular, a highly dynamic peer population is supported. The network traffic control units 5 are relatively stable and thus fulfill the requirements of structured peer-to-peer systems. They protect the network from flooding messages and use an advanced location mechanism instead.

In the following, the architecture is explained using the Gnutella protocol as an example of a peer-to-peer application protocol. Other unstructured peer-to-peer approaches work similarly. The peers execute the Gnutella protocol to locate files and execute HTTP to download files. Network traffic control units also implement the Gnutella protocol in order to communicate with the peers. Unlike standard peers, network traffic control units do not participate in flooding Gnutella requests. Between network traffic control units, an advanced location mechanism is used.

A part of the managing capabilities of the control logic is preferably the application of a set of rules for translating the content of queries, and especially the strings of such queries representing the content, into keys. Keys are easier to query and less vague in representing a statement than language is. In addition, keys are shorter than strings and therefore need less bandwidth.

A given input string is first processed by a stop-word filter that removes all words that are insignificant for the search. A parser then generates a set of hash-codes, which are regarded as a special implementation of keys using hash functions, from the remaining words of the query. In the simplest case, the parser generates a single hash-code from each word. A sophisticated parser maps content to a hierarchical structure, for instance

-   filetype=“music”
-   format=“mp3”
-   artist=“vivaldi”
-   conductor=“karajan”

and allocates a key to this structure.
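The two parser variants can be sketched in Python as follows. The stop-word list, the choice of SHA-1 as hash function, and the canonical `name=value` serialization of the hierarchical structure are assumptions made for illustration only; the text does not prescribe any of them.

```python
import hashlib

# Assumption: a tiny illustrative stop-word list.
STOP_WORDS = {"the", "a", "an", "of", "and", "by"}


def derive_keys(query: str) -> list[str]:
    """Simplest case: one hash-code per remaining word of the query."""
    words = [w for w in query.lower().split() if w not in STOP_WORDS]
    return [hashlib.sha1(w.encode("utf-8")).hexdigest() for w in words]


def structure_key(attributes: dict[str, str]) -> str:
    """Allocate a single key to an attribute structure such as filetype/format/artist."""
    # Sorting makes the key independent of the order in which attributes are given.
    canonical = ";".join(f"{name}={value}" for name, value in sorted(attributes.items()))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()
```

For example, `derive_keys("The Four Seasons by Vivaldi")` yields three keys, one each for "four", "seasons" and "vivaldi", and `structure_key({"filetype": "music", "format": "mp3"})` yields one key for the whole structure.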

For each valid substring of a query, additional keys might be computed. This makes it possible to implement substring queries. Details of how this could be achieved are described in “A Scalable Peer-to-Peer Architecture for Intentional Resource Discovery” by Magdalena Balazinska, et al., Pervasive 2002, International Conference on Pervasive Computing, August 2002, which is hereby incorporated by reference.

The resulting keys are used to retrieve information from a distributed key-peer index. The key-peer index is distributed among network traffic control units in such a way that every network traffic control unit maintains only a part of the overall key-peer index. A single network traffic control unit administers a limited number of keys. Such a key-peer index maps each key to the peers that store the file the key is related to.
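One possible reading of the substring keys is sketched below in Python. What counts as a "valid" substring is not defined in the text; the sketch assumes it means any contiguous substring of at least `min_length` characters, and the function name and the use of SHA-1 are likewise assumptions. The cited paper may use a different scheme.

```python
import hashlib


def substring_keys(word: str, min_length: int = 3) -> set[str]:
    """Compute one hash-code for every contiguous substring of at least min_length characters."""
    keys = set()
    for start in range(len(word)):
        for end in range(start + min_length, len(word) + 1):
            keys.add(hashlib.sha1(word[start:end].encode("utf-8")).hexdigest())
    return keys
```

With these keys inserted into the index, a query for "val" would find a file indexed under "vivaldi", at the cost of a number of keys per word that grows quadratically with the word length.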

FIG. 10 illustrates the data structure of such a key-peer index stored on a single network traffic control unit. The network traffic control unit keeps a fraction of the overall distributed key-peer index. A given key maps to none, one or more filenames, and each corresponding file is stored on one or several peers.

The basic logic for managing queries that is implemented in the control logic of a network traffic control unit preferably includes the following:
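The mapping described for FIG. 10 (key to filenames, filename to peers) can be sketched as a nested dictionary in Python. The class and method names are hypothetical and the sketch covers only the fraction of the index held by one unit.

```python
from collections import defaultdict


class KeyPeerIndex:
    """Fraction of the distributed key-peer index kept by one network traffic control unit."""

    def __init__(self):
        # key -> filename -> set of peers storing that file
        self._entries = defaultdict(lambda: defaultdict(set))

    def add(self, key: str, filename: str, peer: str) -> None:
        self._entries[key][filename].add(peer)

    def lookup(self, key: str) -> dict[str, set[str]]:
        """Return none, one or more filenames for the key, each with its peers."""
        found = self._entries.get(key, {})  # .get() does not create an empty entry
        return {filename: set(peers) for filename, peers in found.items()}
```

A key that is not administered by the unit simply yields an empty result, which corresponds to the "none" case in the text.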

For a given key, the control logic locates the network traffic control unit on which more information associated with this key is stored. There is preferably an index or a function available that maps keys to network traffic control units. A key and its associated information (especially peer and filename, and possibly the network traffic control unit that is associated with the peer) might be stored on multiple network traffic control units. In this case, the control logic locates the one which is closest to the requesting network traffic control unit. The key-network traffic control unit index is dynamic in the sense that new network traffic control units can be added and existing ones can be removed. Compared to the change rate of the peers, the change rate of the network traffic control units is expected to be two orders of magnitude lower.
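The text leaves open which function maps keys to network traffic control units. As one illustration only, the Python sketch below uses rendezvous (highest-random-weight) hashing, a technique chosen here because it tolerates units being added and removed; it is an assumption and not the mechanism named in the text. The unit identifiers are hypothetical.

```python
import hashlib


def responsible_unit(key: str, units: list[str]) -> str:
    """Pick the unit administering a key via rendezvous hashing (illustrative choice)."""

    def weight(unit: str) -> str:
        # Every (unit, key) pair gets a pseudo-random weight; the highest weight wins.
        return hashlib.sha1(f"{unit}|{key}".encode("utf-8")).hexdigest()

    return max(units, key=weight)
```

With this choice, removing a unit only reassigns the keys that unit administered; all other keys stay where they are, which suits the slowly changing population of network traffic control units.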

The result of a key query is a list of filenames which are then returned to the peer that originally issued the query. Optionally, these filenames may be compared by the network traffic control unit against the filenames extracted from the original query in order to produce a ranking list, which is then returned to the peer. Also, due to the inherent nature of generating keys, and especially of hashing, there is always a non-zero chance of two different inputs being mapped to the same key. This is a further reason to double-check, e.g. by comparing the original filename with the returned filename.

FIG. 11 illustrates a flow chart of managing a query request triggered by a peer that sends a query message 100. The network traffic control unit might be a peer and therefore receive the message, or alternatively intercept the message by diverting peer-to-peer traffic (here Gnutella traffic) from the network.
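The optional ranking and double check can be sketched in Python as follows. The scoring rule (number of query words that appear in the filename) and the tokenization on dots and underscores are assumptions for illustration; the text does not specify how the comparison is done.

```python
def rank_filenames(query: str, filenames: list[str]) -> list[str]:
    """Rank returned filenames by word overlap with the query; drop those with no overlap."""
    query_words = set(query.lower().split())

    def score(filename: str) -> int:
        words = filename.lower().replace(".", " ").replace("_", " ").split()
        return len(query_words & set(words))

    # A score of zero indicates a likely hash collision, so such names are discarded.
    matching = [name for name in filenames if score(name) > 0]
    return sorted(matching, key=score, reverse=True)
```

For the query "vivaldi spring", a returned list containing "bach_air.mp3", "vivaldi_spring.mp3" and "vivaldi_winter.mp3" would be reduced to the two Vivaldi files, with the better match first.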

The network traffic control unit then proceeds as follows:

It computes a set of hash codes 103 based on the search string. For each valid substring, a hash code is computed, too. This makes it possible to implement substring queries. Stop-word filter and name parser techniques are applied for generating the hash-codes, steps 101, 102. These steps can of course be performed in a different order.

The network traffic control unit locates the destinations in terms of other network traffic control units where the derived hash-codes/keys are administered. This is achieved by means of a key-network traffic control unit index. A query message for the computed keys is sent from the managing network traffic control unit to the discovered remote network traffic control units. Upon reception of such a message, the remote network traffic control unit will return values associated with the queried key, step 104. These values comprise a list of peers that store the requested file. For each peer, the associated network traffic control unit is also listed.

Whenever the specific keys are stored on the managing network traffic control unit, there is of course no need for contacting other network traffic control units, and the peers that are associated with the queried keys can be found on the local network traffic control unit.

The returned keys might be translated into strings and be compared to the original query string sent by the requesting peer, step 105.

If the list is not empty, then the managing network traffic control unit returns a hit message to the querying peer and gives itself as source, step 106. A peer is then free to choose among the returned filenames and select any number of them for retrieval.
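Steps 100 to 106 can be tied together in a compact Python sketch. It is a simplification under stated assumptions: remote network traffic control units are modeled as in-memory dictionaries instead of being contacted over the network, substring keys and the comparison of step 105 are omitted, and all names (`handle_query`, `unit_of_key`, `unit_indexes`, the identifier "G") are hypothetical.

```python
import hashlib


def key_of(word: str) -> str:
    """Hash-code used as key (assumption: SHA-1 of the lower-case word)."""
    return hashlib.sha1(word.encode("utf-8")).hexdigest()


def handle_query(query, unit_of_key, unit_indexes, own_id="G",
                 stop_words=frozenset({"the", "of", "by"})):
    """Return a hit naming the managing unit as source, or None if nothing was found.

    unit_of_key:  function mapping a key to the id of the administering unit
    unit_indexes: unit id -> {key: [(filename, peer), ...]} (stand-in for remote units)
    """
    # Steps 101 to 103: stop-word filter and parser produce the hash codes.
    keys = [key_of(w) for w in query.lower().split() if w not in stop_words]
    entries = []
    for key in keys:
        unit = unit_of_key(key)  # key-network traffic control unit index
        entries.extend(unit_indexes.get(unit, {}).get(key, []))  # step 104
    filenames = sorted({filename for filename, _peer in entries})
    if not filenames:
        return None
    # Step 106: the managing unit gives itself as source of the hit.
    return {"source": own_id, "filenames": filenames}
```

Because the hit names the managing unit as source, the querying peer later addresses its download request to that unit rather than to the peer that actually stores the file.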

If either a push request or an HTTP GET request arrives at the network traffic control unit, then the network traffic control unit selects the real data source, i.e. a specific peer, to retrieve the file from. As selection criteria it may use the quality of the network connection (delay, throughput, error rate, or other parameters that might be evaluated by the network traffic control units) and the freshness of the information. The network connection for retrieving is not offered to the offering peer but instead to its associated network traffic control unit, as the two network traffic control units are assumed to be correlated. In locality-aware DHTs, evaluation of the network connection is delivered without any additional overhead.

After the serving peer is chosen, the managing network traffic control unit starts retrieving the file from this peer via the associated network traffic control unit, and then directly forwards it to the querying peer.

In case an entry is not available (which might mean that the peer listed as offering a file has disappeared or no longer has this file), an explicit removal procedure as described below is executed to disseminate this new information to the other network traffic control units. Also, the managing network traffic control unit may decide to retrieve from another offering peer instead, for example the next best one, or to notify the querying peer directly of the failed attempt.
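The selection of the real data source can be sketched in Python. The text names the criteria (delay, throughput, error rate, freshness) but not how they are combined; the scoring formula below, including its constants, is purely an illustrative assumption, as are the class and field names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    peer: str
    delay_ms: float         # measured delay of the network connection
    throughput_kbps: float  # measured throughput
    error_rate: float       # fraction between 0.0 and 1.0
    age_s: float            # time since the index entry was last confirmed


def score(candidate: Candidate) -> float:
    """Higher is better. The weighting is an arbitrary example, not taken from the text."""
    goodput = candidate.throughput_kbps * (1.0 - candidate.error_rate)
    delay_penalty = 1.0 + candidate.delay_ms / 100.0
    staleness_penalty = 1.0 + candidate.age_s / 3600.0
    return goodput / (delay_penalty * staleness_penalty)


def select_source(candidates: list[Candidate]) -> Candidate:
    """Choose the offering peer with the best score."""
    return max(candidates, key=score)
```

Sorting the candidates by the same score would also give the "next best" offering peer to fall back on if the first retrieval fails.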

FIG. 12 depicts a flow chart for an insert operation that is based on monitoring hit messages that are forwarded through the network. The insert method is managed by the control logic of a network traffic control unit.

As soon as a network traffic control unit observes a hit answer according to the peer-to-peer application protocol that originates from one of its local peers, a new entry has to be made in the distributed key-peer index as follows:

The network traffic control unit computes a set of keys, steps 201, 202, based on the filename 200. This computation is analogous to the one used for the search operation. For each key that is computed, the managing control logic identifies a network traffic control unit that is administering this key. If that key does not exist yet, it is assigned to one of the network traffic control units. The administering network traffic control unit then stores the key together with the filename, the address of the peer and the IP address of the associated network traffic control unit.
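The insert operation can be sketched in Python along the same lines as the search. The sketch assumes SHA-1 keys derived from the words of the filename, models the administering units as in-memory dictionaries, and uses hypothetical names (`insert_hit`, `unit_of_key`, `unit_indexes`) and example addresses.

```python
import hashlib


def insert_hit(filename, peer, own_address, unit_indexes, unit_of_key):
    """Record an observed hit from a local peer in the distributed key-peer index.

    unit_indexes: unit id -> {key: set of (filename, peer, unit address)}
    unit_of_key:  function mapping a key to the id of the administering unit
    """
    # Steps 201, 202: derive keys from the filename as for the search operation.
    words = filename.lower().replace(".", " ").replace("_", " ").split()
    for word in words:
        key = hashlib.sha1(word.encode("utf-8")).hexdigest()
        unit = unit_of_key(key)
        # A key that does not exist yet is created on its administering unit.
        entries = unit_indexes.setdefault(unit, {}).setdefault(key, set())
        entries.add((filename, peer, own_address))
```

Storing the address of the associated network traffic control unit with each entry is what later allows a remote unit to retrieve the file via that unit.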

A remove operation that also updates the key-peer index can include an implicit and/or an explicit part. Implicit removals occur if an index entry has not been accessed for some time. Timeouts are expected to be in the range of one or more hours. Implicit removals are done by each network traffic control unit individually by periodically checking for entries that have timed out. Explicit removals occur when a download from an index entry did not succeed because either the file or the peer has disappeared. In this case, the index entry is removed on all network traffic control units that store the associated key.
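Both kinds of removal can be sketched for a single network traffic control unit in Python. The class and method names are hypothetical, the one-hour default timeout is only an example within the range mentioned above, and time is passed in explicitly so that the behaviour is easy to follow. Dissemination of explicit removals to other units is not modeled.

```python
class ExpiringIndex:
    """Index fraction of one unit with implicit (timeout) and explicit removal."""

    def __init__(self, timeout_s: float = 3600.0):
        self.timeout_s = timeout_s
        self.last_access = {}  # (key, filename, peer) -> time of last access in seconds

    def touch(self, key, filename, peer, now):
        """Insert an entry or refresh its access time."""
        self.last_access[(key, filename, peer)] = now

    def purge_expired(self, now):
        """Implicit removal: drop entries not accessed within the timeout."""
        expired = [e for e, t in self.last_access.items() if now - t > self.timeout_s]
        for entry in expired:
            del self.last_access[entry]
        return expired

    def remove_explicit(self, filename, peer):
        """Explicit removal after a failed download of filename from peer."""
        stale = [e for e in self.last_access if e[1] == filename and e[2] == peer]
        for entry in stale:
            del self.last_access[entry]
        return stale
```

In a running unit, `purge_expired` would be called periodically, while `remove_explicit` would be triggered by a failed retrieval and repeated on every unit that stores the associated key.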

1. Network traffic control unit, comprising a filter unit (51) for intercepting messages relating to peer-to-peer application, from a network line (3), irrespective of destination, a control logic (52) that is configured for managing a request represented by an intercepted message subject to its content and subject to peering specific knowledge the network traffic control unit (5) provides.
2. Network traffic control unit according to claim 1, wherein the network traffic control unit (5) is prepared to communicate according to a peer-to-peer application protocol.
3. Network traffic control unit according to claim 2, wherein the network traffic control unit (5) is prepared to apply the peer-to-peer application protocol for managing connect requests.
4. Network traffic control unit according to any one of the claims 1 to 3, wherein the network traffic control unit (5) is prepared to communicate according to a protocol different to the peer-to-peer application protocol.
5. Network traffic control unit according to claim 4, wherein the network traffic control unit (5) is prepared to apply the protocol different to the peer-to-peer application protocol for managing query requests.
6. Network traffic control unit according to any one of the preceding claims, wherein the peering specific knowledge comprises information on peer-to-peer connections the network traffic control unit (5) is currently aware of.
7. Network traffic control unit according to any one of the preceding claims, wherein the peering specific knowledge comprises information on peer nodes associated to the network traffic control unit (5).
8. Network traffic control unit according to any one of the preceding claims, wherein the peering specific knowledge comprises an index that allocates keys representing data files for download to network traffic control units.
9. Network traffic control unit according to any one of the preceding claims, wherein the peering specific knowledge comprises an index that allocates peer nodes to keys representing data files for download.
10. Network traffic control unit according to any one of the preceding claims, wherein the control logic (52) is configured for implementing a set of rules for deriving keys from intercepted query requests.
11. Method for controlling traffic on a network, comprising: receiving messages related to peer-to-peer application, intercepted by a filter unit from a network line (3), irrespective of the messages' destination, managing a request represented by an intercepted message subject to its content and subject to peering specific information.
12. Method according to claim 11, comprising dropping the intercepted message.
13. Method according to claim 11 or claim 12, wherein a request to be managed is a connect request issued from a peer node and directed to another peer node.
14. Method according to claim 13, wherein managing the connect request is subject to existing connections the network traffic control unit is aware of.
15. Method according to claim 14, wherein no message is sent to the addressee of the intercepted connect request when a connection is already established that can serve or be extended to serve the requesting peer node.
16. Method according to any one of the claims 13 to 15, comprising sending a connect request to the originator of the intercepted connect request in response to the intercepted connect request.
17. Method according to one of the claims 13, 14 or 16, comprising sending a connect request to the addressee of the intercepted connect request.
18. Method according to one of the claims 13, 14 or 16, comprising sending a connect request to the addressee of the intercepted connect request pretending the originator of the intercepted connect request is sending the connect request.
19. Method according to one of the claims 13 to 16, comprising sending a connect request to a peer node other than the addressee of the intercepted connect request.
20. Method according to one of the claims 13 to 16, comprising sending a connect request to another network traffic control unit (5).
21. Method according to claim 16 in combination with any one of the claims 17 to 20, sending the connect request to another party than the originator of the intercepted connect request once the originator has accepted the connect request from the network traffic control unit directed to the originator.
22. Method according to any one of the preceding claims 11 to 21, wherein a request to be managed is a data file query issued by a peer node.
23. Method according to claim 22, wherein managing the query request is subject to an index that allocates keys representing data files for download to network traffic control units.
24. Method according to claim 22 or claim 23, wherein managing the query request is subject to an index that allocates peer nodes to keys.
25. Method according to any one of the claims 22 to 24, comprising deriving one or more keys from the content of the query request.
26. Method according to claim 25, comprising directing a request to one or more remote network traffic control units that are allocated to the derived keys according to the key-network traffic control unit index.
27. Method according to claim 26, comprising receiving a list of peer nodes that are allocated to the keys, from the remote network traffic control unit.
28. Method according to claim 27, comprising sending a hit message to the querying peer node.
29. Method according to any one of the preceding claims 11 to 28, comprising administering a key-peer node index for some keys, and providing other network traffic control units on request with the knowledge which peer nodes are allocated to a requested key according to the key-peer node index.
30. Method according to claim 29, wherein administering the key-peer node index comprises removals of entries.
31. Method according to any one of the preceding claims 11 to 30, comprising monitoring hit messages sent from an associated peer node, deriving one or more keys from the content of a hit message, allocating the sending peer node to the derived keys, and storing the key-peer node relation in a key-peer node index.
32. A network comprising at least one group (1, 2, 4) of peer nodes, a network line (3) serving as ingress/egress line for this peer group (1, 2, 4), and a network traffic control unit (5) according to any one of the preceding claims 1 to 10, intercepting messages from the network line.
33. A computer program element comprising computer program code which, when loaded in a processor unit of a network traffic control unit, configures the processor unit for performing a method as claimed in any one of claims 11 to 31.